ECE 4332 presentation notes. 12/01/11

Start time: 2:05. Done 2:30. Excellent time management.

NOTES:

1Mb low power SRAM. Goal is to minimize power

Claim: Successful design.

Approach: Optimized Architecture, bitcell, model sims, layout

Architecture:

Research (prior art, textbooks, prior years)

64 blocks.

Bitcell optimizations: optimized for low power. Transistor sizing sims -> minimum device sizes worked OK from a stability point of view.

Layout – focus on small area to reduce cap and cost. Mirrored design. Wide cell layout. Nice layout.

Vertical VSS, VDD. BLs vertical on sides of cell. WLs horizontal. Area: 0.816um^2 with own PMOS design. (Why not compare area to prior work from last year or papers?)

Read bitcell: wanted to lower VDD. Looked at read SNM across process corners. Assumed 10% VDD noise when setting the tolerable SNM.

Could do a dynamic change of VDD during hold. Monte Carlo sim using local variation for hold SNM. Chose 350mV based on MC for hold sleep mode.

Model for block, row, column #s. Claim that the model accurately represents the SRAM circuit as a whole. Model for one block: normal bits in each corner. 4 multiplied load cells. C from extraction. R from PDK.

Included scalable global WL and read / write mux.

2 sense amp (SA) options: one from the Rabaey book and one from a paper. Tested both for delay (assumed area and energy can be neglected). Picked the StrongARM amp for better delay. Used the base 6T layout as the starting point for the SA layout and added to it.

Decoder optimization: naïve version is parallel AND gates. Use predecoders (common practice). Difficult to model variable decoders. Optimized decoders later and adjusted model results.
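A minimal sketch of the predecoding idea, for reference. The 6-bit address width and 2-bit group size are hypothetical (the notes do not record the group's decoder widths), and the function names are illustrative only. It checks that a predecoded decoder selects the same word line as the naive one while needing fewer inputs on each final AND gate.

```python
def naive_decode(addr, n_bits):
    """One n_bits-input AND per word line: line i is high only when addr == i."""
    return [int(addr == i) for i in range(2 ** n_bits)]


def predecoded_decode(addr, n_bits, group_bits=2):
    """Decode each group_bits-wide address slice to one-hot, then AND one line per group."""
    assert n_bits % group_bits == 0, "sketch assumes equal-width groups"
    n_groups = n_bits // group_bits
    mask = (1 << group_bits) - 1
    # Predecode stage: one small one-hot decoder per address slice.
    pre = []
    for g in range(n_groups):
        field = (addr >> (g * group_bits)) & mask
        pre.append([int(field == v) for v in range(2 ** group_bits)])
    # Final stage: each word line ANDs one predecoded line from every group.
    out = []
    for i in range(2 ** n_bits):
        bit = 1
        for g in range(n_groups):
            bit &= pre[g][(i >> (g * group_bits)) & mask]
        out.append(bit)
    return out


def final_gate_fan_in(n_bits, group_bits=None):
    """Inputs per word-line AND gate: n_bits when naive, one per group when predecoded."""
    return n_bits if group_bits is None else n_bits // group_bits


N_BITS = 6  # hypothetical address width, not taken from the presentation
same = all(naive_decode(a, N_BITS) == predecoded_decode(a, N_BITS)
           for a in range(2 ** N_BITS))
print(same, final_gate_fan_in(N_BITS), final_gate_fan_in(N_BITS, 2))
```

With these assumed widths the final gates drop from 6 inputs to 3, at the cost of three small 2-to-4 predecoders shared by all word lines.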

Used the model to find best B, R, C. Added decoders in. Chose 64 blocks. (128 block choice makes demux and mux larger)
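For context on the B, R, C search space, a small sketch that enumerates power-of-two splits of the array. It assumes "1Mb" means 2^20 bits and that all three counts are powers of two; the notes do not record the row and column counts the group actually chose, so the square-block check is only an illustration.

```python
TOTAL_BITS = 2 ** 20  # assumption: "1Mb" taken as 2^20 bits


def partitions(total_bits=TOTAL_BITS):
    """Yield (blocks, rows, cols), all powers of two, with blocks*rows*cols == total_bits."""
    n = total_bits.bit_length() - 1  # log2 of the total bit count
    for b in range(n + 1):
        for r in range(n - b + 1):
            c = n - b - r
            yield 2 ** b, 2 ** r, 2 ** c


def square_blocks(blocks):
    """Return the (rows, cols) options for a block count where rows == cols."""
    return [(r, c) for b, r, c in partitions() if b == blocks and r == c]


print(square_blocks(64))   # [(128, 128)]
print(square_blocks(128))  # []
```

Under these assumptions 64 blocks allows square 128 x 128 blocks, while 128 blocks leaves 2^13 bits per block, which has no square power-of-two split.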

Timing diagram. Text WAY TOO SMALL; need to annotate better for slides. Nice-looking curves, though.

LAYOUT: Got all the blocks of cells. Roughly square overall design. Periphery circuits in the middle to reduce wire lengths. Lower level layouts done (not all put together). Pitch matching (good job explaining it!!) of Read/Write Mux and SAs. Top level layout floorplan: the periphery isn’t done, and as it is shown, the BLs from 4 blocks have to funnel over each other to get to the column periphery – this won’t work very well if at all.

Simulated the entire model and it WORKS at all corners and across a range of VDDs and temperatures. SS corner forced a longer clock period for read. 0.54V could be problematic once local variation is accounted for. Is this a 1-bit or 32-bit sim? (Answer: 1-bit sim.) Is the metric counting just a 1-bit operation or all 32 bits? (Answer: ?)

ECC as extra bonus circuit

Metrics:

Ewr = 1.57pJ, Erd = 1.43pJ

Delay = 2.9ns

Metric: how does it compare with prior work?

Answer: Last year’s low power SRAM was also at 0.6V, but in fJ range – seems very suspicious

PRESENTATION

Well practiced. Ties!! Nice looking slides. Good qualifications and appropriate level of detail. Good explanations.

Need to cite sources for figures that aren’t yours.

QUESTIONS

Layout: SAs under 4 blocks? -> Floorplan revision for periphery

Metric: Any good? Higher power than last year’s low power group.

E-D curve for your design as f(VDD)? Metric as f(VDD)? This would be nice to add.

Give the equation used to construct Etotal and Dtotal. Show the comparison with prior years. Show how you account for 32 bits. Comment.
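To show why the 1-bit vs 32-bit question matters, a rough sketch using the reported numbers. The metric form here (average of read and write energy, times delay) is an assumption, not the group's or the course's actual equation, and the linear scaling to 32 bits is a crude bound since shared costs such as decoders and word lines would not scale per bit.

```python
E_WR_PJ = 1.57   # reported write energy, pJ
E_RD_PJ = 1.43   # reported read energy, pJ
DELAY_NS = 2.9   # reported delay, ns
WORD_BITS = 32


def energy_delay(e_wr_pj, e_rd_pj, delay_ns, word_bits=1):
    """Assumed metric: mean of read/write energy, scaled linearly by word_bits, times delay.

    Returns a value in pJ*ns. Both the averaging and the linear scaling are
    illustrative assumptions, not the definition used in the presentation.
    """
    e_avg = (e_wr_pj + e_rd_pj) / 2
    return e_avg * word_bits * delay_ns


one_bit = energy_delay(E_WR_PJ, E_RD_PJ, DELAY_NS)
full_word = energy_delay(E_WR_PJ, E_RD_PJ, DELAY_NS, word_bits=WORD_BITS)
print(f"1-bit: {one_bit:.2f} pJ*ns, 32-bit (linear scaling): {full_word:.2f} pJ*ns")
```

Under these assumptions the two readings differ by a factor of 32 (about 4.35 vs about 139 pJ*ns), so any comparison with prior years needs the accounting stated explicitly.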

What was the key lesson? What did you learn? (Model; reuse as much as possible from prior work, but that is tricky.)

What would you do differently next time?

Are you proud of this design?

No RC on WL in model? Why?

How much of schematic is done? How much layout is done?

SKILL code?